Statistical string similarity model for information linkage

Related articles

A Statistical Model for Flexible String Similarity

This paper proposes a novel framework for image retrieval. The retrieval is treated as searching for an ordered cycle in an image database. The optimal cycle can be found by minimizing the geometric manifold entropy of images. The minimization is solved by the proposed method, fast active tabu search. Experimental results demonstrate the framework for image retrieval is feasible and quite promi...

String kernels and similarity measures for information retrieval

Measuring a similarity between two strings is a fundamental step in many applications in areas such as text classification and information retrieval. Lately, kernel-based methods have been proposed for this task, both for text and biological sequences. Since kernels are inner products in a feature space, they naturally induce similarity measures. Information-theoretical approaches have also bee...

Employing Trainable String Similarity Metrics for Information Integration

The problem of identifying approximately duplicate objects in databases is an essential step for the information integration process. Most existing approaches have relied on generic or manually tuned distance metrics for estimating the similarity of potential duplicates. In this paper, we present a framework for improving duplicate detection using trainable measures of textual similarity. We pr...

String Metrics and Word Similarity applied to Information Retrieval

Over the past three decades, Information Retrieval (IR) has been studied extensively. The purpose of information retrieval is to assist users in locating information they are looking for. Information retrieval is currently being applied in a variety of application domains from database systems to web information search engines. The main idea of it is to locate documents that contain terms the u...

A Dependency Treelet String Correspondence Model for Statistical Machine Translation

This paper describes a novel model using dependency structures on the source side for syntax-based statistical machine translation: Dependency Treelet String Correspondence Model (DTSC). The DTSC model maps source dependency structures to target strings. In this model translation pairs of source treelets and target strings with their word alignments are learned automatically from the parsed and...

Journal

Journal title: Progress in Informatics

Year: 2009

ISSN: 1349-8614,1349-8606

DOI: 10.2201/niipi.2009.6.7